[Management/Client] Trigger debug bundle runs from API/Dashboard#4592
[Management/Client] Trigger debug bundle runs from API/Dashboard#4592pappz merged 51 commits intofeature/remote-debug-releasefrom
Conversation
…#4367) * new bi-directional stream for jobs * create bidirectional job channel to send requests from the server and receive responses from the client * fix tests * fix lint and close bug * fix lint * clean up & fix close of closed channel * add nolint:staticcheck * remove some redundant code from the job channel PR since this one is a cleaner rewrite * cleanup removes a pending job safely * change proto * rename to jobRequest * apply feedback 1 * apply feedback 2 * fix typo * apply feedback 3 * apply last feedback
fix lint clean up fix MarkPendingJobsAsFailed apply feedbacks 1 fix typo change api and apply new schema fix lint fix api object clean switch case apply feedback 2 fix error handle in create job get rid of any/interface type in job database fix sonar issue use RawJson for both parameters and results running go mod tidy update package fix 1 update codegen fix code-gen fix snyk fix snyk hopefully
…ne (#4428) * integrate api integrate api with stream and implement some client side * fix typo and fix validation * use real daemon address * redo the connect via address * Refactor the debug bundle generator to be ready to use from engine (#4469) * fix tests * fix lint * fix bug with stream * try refactor status 1 * fix convert fullStatus to statusOutput & add logFile * fix tests * fix tests * fix not enough arguments in call to nbstatus.ConvertToStatusOutputOverview * fix status_test * fix(engine): avoid deadlock when stopping engine during debug bundle * use atomic for lock-free * use new lock --------- Co-authored-by: Zoltan Papp <zoltan.pmail@gmail.com>
WalkthroughAdds peer job support (bundle workloads) end-to-end: new Job proto/HTTP APIs, manager/channel persistence, client-side bundle generation/uploader, job streaming in management gRPC, status → proto conversion, and log-path propagation through client/engine/connect flows. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant API as Account API (HTTP)
participant JobMgr as JobManager
participant GRPCSrv as Management gRPC Server
participant GRPCCli as Client (Job stream)
participant Exec as Job Executor
participant Uploader as UploadDebugBundle
API->>GRPCSrv: POST /peers/{id}/jobs (CreateJob)
GRPCSrv->>JobMgr: SendJob(account, peer, JobRequest)
JobMgr->>GRPCCli: enqueue JobRequest -> peer channel
GRPCCli->>Exec: receive JobRequest (bundle)
Exec->>Uploader: UploadDebugBundle(ctx, mgmURL, bundleFile)
Uploader-->>Exec: upload_key
Exec-->>GRPCSrv: JobResponse(upload_key)
GRPCSrv->>JobMgr: HandleResponse(JobResponse)
JobMgr->>API: persist completion (CompletePeerJob)
sequenceDiagram
autonumber
participant Daemon as Local Daemon
participant Connect as ConnectClient
participant Engine as Engine
participant DebugGen as BundleGenerator
participant Uploader as UploadDebugBundle
Daemon->>Connect: Run(runningChan, logPath)
Connect->>Engine: NewEngine(..., LogPath, ProfileConfig)
Engine->>DebugGen: Generate(bundleDeps, params)
DebugGen->>DebugGen: create status.txt from recorder (if present)
DebugGen->>Uploader: UploadDebugBundle(mgmURL, bundlePath)
Uploader-->>DebugGen: upload_key
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Potential hotspots to inspect:
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
client/internal/debug/debug.go (2)
346-355: Gate systemd fallback by OS to avoid noisy warnings.Currently falls back to systemd logs even on non‑Linux. Guard it.
Apply:
- if g.logPath != "" && !slices.Contains(util.SpecialLogs, g.logPath) { + if g.logPath != "" && !slices.Contains(util.SpecialLogs, g.logPath) { if err := g.addLogfile(); err != nil { log.Errorf("failed to add log file to debug bundle: %v", err) - if err := g.trySystemdLogFallback(); err != nil { + if runtime.GOOS == "linux" { + if err := g.trySystemdLogFallback(); err != nil { + log.Errorf("failed to add systemd logs as fallback: %v", err) + } + } else { + log.Debug("systemd journal fallback skipped: unsupported OS") + } log.Errorf("failed to add systemd logs as fallback: %v", err) } } - } else if err := g.trySystemdLogFallback(); err != nil { - log.Errorf("failed to add systemd logs: %v", err) - } + } else if runtime.GOOS == "linux" { + if err := g.trySystemdLogFallback(); err != nil { + log.Errorf("failed to add systemd logs: %v", err) + } + } else { + log.Debug("no explicit logPath; systemd journal fallback skipped on non-Linux") + }
668-671: Darwin stdout/err paths appear swapped.stdErrLogPath gets “out.log” and stdoutLogPath gets “err.log”. Swap constants.
Apply:
- darwinErrorLogPath = "/var/log/netbird.out.log" - darwinStdoutLogPath = "/var/log/netbird.err.log" + darwinErrorLogPath = "/var/log/netbird.err.log" + darwinStdoutLogPath = "/var/log/netbird.out.log"
🧹 Nitpick comments (9)
shared/management/http/api/generate.sh (1)
14-14: Consider pinning the oapi-codegen version for reproducible builds.Using
@latestcan lead to non-reproducible builds as different versions may be installed at different times or in different environments. Consider pinning to a specific version tag (e.g.,@v2.4.1) to ensure consistent code generation across the team and CI/CD pipelines.Apply this diff to pin to a specific version:
-go install github.com/oapi-codegen/oapi-codegen/v2/cmd/oapi-codegen@latest +go install github.com/oapi-codegen/oapi-codegen/v2/cmd/oapi-codegen@v2.4.1Note: Replace
v2.4.1with the latest stable version you want to standardize on.shared/management/proto/management.proto (2)
52-53: Consider enhancing the RPC documentation.The comment "Executes a job on a target peer (e.g., debug bundle)" is brief. Consider expanding it to describe the bidirectional streaming nature, typical flow (request → response), and any timeout/cancellation semantics.
Example enhancement:
- // Executes a job on a target peer (e.g., debug bundle) + // Job executes asynchronous jobs on target peers (e.g., debug bundle generation). + // Bidirectional streaming: client sends JobRequest, server may send status updates, + // and finally sends JobResponse with results or failure reason. rpc Job(stream EncryptedMessage) returns (stream EncryptedMessage) {}
91-96: Clarify or rename ambiguous field names.The field names in
BundleParameterscould be clearer:
bundle_for(bool) - unclear what this flag controls. Consider renaming to something more descriptive likeinclude_statusorfull_bundle.bundle_for_time(int64) - unclear units and purpose. Is this a duration, timestamp, or time range? Consider renaming (e.g.,duration_seconds,time_window_seconds) and adding a comment.Example clarification:
message BundleParameters { - bool bundle_for = 1; - int64 bundle_for_time = 2; + bool include_full_status = 1; // Include complete peer status in bundle + int64 time_window_seconds = 2; // Time window in seconds for log collection int32 log_file_count = 3; bool anonymize = 4; }client/internal/debug/debug.go (2)
1218-1227: Anonymize non‑IP map keys to avoid leaking domains/identifiers.Keys that are domains or arbitrary strings are returned verbatim.
Apply:
func anonymizeMapKey(key string, anonymizer *anonymize.Anonymizer) string { if prefix, err := netip.ParsePrefix(key); err == nil { anonIP := anonymizer.AnonymizeIP(prefix.Addr()) return fmt.Sprintf("%s/%d", anonIP, prefix.Bits()) } if ip, err := netip.ParseAddr(key); err == nil { return anonymizer.AnonymizeIP(ip).String() } - return key + // Anonymize arbitrary string keys (e.g., domains, hostnames, IDs) + return anonymizer.AnonymizeString(key) }
774-786: Avoid repeated os.Stat in sort comparator.Precompute mod times once to reduce I/O during sorting.
Apply:
- // sort files by modification time (newest first) - sort.Slice(files, func(i, j int) bool { - fi, err := os.Stat(files[i]) - if err != nil { - log.Warnf("failed to stat rotated log %s: %v", files[i], err) - return false - } - fj, err := os.Stat(files[j]) - if err != nil { - log.Warnf("failed to stat rotated log %s: %v", files[j], err) - return false - } - return fi.ModTime().After(fj.ModTime()) - }) + // sort files by modification time (newest first) with cached stats + statTimes := make(map[string]time.Time, len(files)) + for _, f := range files { + if fi, err := os.Stat(f); err == nil { + statTimes[f] = fi.ModTime() + } else { + log.Warnf("failed to stat rotated log %s: %v", f, err) + } + } + sort.Slice(files, func(i, j int) bool { + return statTimes[files[i]].After(statTimes[files[j]]) + })shared/management/http/api/openapi.yml (2)
50-78: Make bundle timeframe params self‑consistent.Requiring both bundle_for (bool) and bundle_for_time (int) forces clients to send a meaningless time when bundle_for=false.
Consider either:
- Remove bundle_for; use a single optional capture_minutes (0 or absent = disabled), or
- Split with oneOf so bundle_for_time is required only when bundle_for=true.
Example simplification:
- BundleParameters: + BundleParameters: type: object - properties: - bundle_for: - type: boolean - description: Whether to generate a bundle for the given timeframe. - example: true - bundle_for_time: - type: integer - minimum: 1 - maximum: 5 - description: Time period in minutes for which to generate the bundle. - example: 2 + properties: + capture_minutes: + type: integer + minimum: 0 + maximum: 5 + default: 0 + description: Minutes to capture recent logs before bundling (0 disables pre‑capture). + example: 2 log_file_count: type: integer minimum: 1 maximum: 1000 description: Maximum number of log files to include in the bundle. example: 100 anonymize: type: boolean description: Whether sensitive data should be anonymized in the bundle. example: false - required: - - bundle_for - - bundle_for_time - - log_file_count - - anonymize + required: + - log_file_count + - anonymize
2353-2421: Add operationIds and minimal examples for client SDKs.Helps codegen and docs.
Apply:
/api/peers/{peerId}/jobs: get: + operationId: listPeerJobs summary: List Jobs description: Retrieve all jobs for a given peer tags: [ Jobs ] ... responses: '200': description: List of jobs content: application/json: schema: type: array items: $ref: '#/components/schemas/JobResponse' + examples: + sample: + value: [ { "id": "job_123", "created_at": "2025-10-06T13:42:26Z", "status": "pending", "triggered_by": "user_abc", "workload": { "type": "bundle", "parameters": { "log_file_count": 10, "anonymize": true } } } ] post: + operationId: createPeerJob summary: Create Job description: Create a new job for a given peer tags: [ Jobs ] ... /api/peers/{peerId}/jobs/{jobId}: get: + operationId: getPeerJob summary: Get Job description: Retrieve details of a specific job tags: [ Jobs ] ...Also applies to: 2422-2456
client/jobexec/executor.go (1)
28-31: Tighten the error log
log.Errorf("failed to upload debug bundle to %v", err)reads awkwardly (there’s no destination in the output). Consider logging the upload endpoint and attach the error viaWithError(err)so failures are easier to diagnose.client/server/debug.go (1)
50-53: Propagate the gRPC context into the uploadWe now have a chance to honour caller cancellation/timeouts—rename the parameter to
ctx context.Contextand passctxtodebug.UploadDebugBundleinstead ofcontext.Background(). That keeps the upload responsive to client-side aborts.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
go.sumis excluded by!**/*.sumshared/management/proto/management.pb.gois excluded by!**/*.pb.goshared/management/proto/management_grpc.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (50)
client/cmd/debug.go(2 hunks)client/cmd/status.go(1 hunks)client/cmd/testutil_test.go(2 hunks)client/cmd/up.go(1 hunks)client/embed/embed.go(1 hunks)client/internal/connect.go(7 hunks)client/internal/debug/debug.go(5 hunks)client/internal/debug/upload.go(1 hunks)client/internal/debug/upload_test.go(2 hunks)client/internal/engine.go(14 hunks)client/internal/engine_test.go(8 hunks)client/ios/NetBirdSDK/client.go(1 hunks)client/jobexec/executor.go(1 hunks)client/server/debug.go(2 hunks)client/server/server.go(3 hunks)client/server/server_test.go(2 hunks)client/status/status.go(6 hunks)client/status/status_test.go(1 hunks)client/ui/debug.go(3 hunks)go.mod(2 hunks)management/internals/server/boot.go(1 hunks)management/internals/server/controllers.go(1 hunks)management/internals/server/modules.go(1 hunks)management/server/account.go(3 hunks)management/server/account/manager.go(1 hunks)management/server/account_test.go(1 hunks)management/server/activity/codes.go(2 hunks)management/server/dns_test.go(1 hunks)management/server/grpcserver.go(11 hunks)management/server/http/handlers/peers/peers_handler.go(3 hunks)management/server/http/testing/testing_tools/channel/channel.go(2 hunks)management/server/jobChannel.go(1 hunks)management/server/management_proto_test.go(3 hunks)management/server/management_test.go(3 hunks)management/server/mock_server/account_mock.go(1 hunks)management/server/nameserver_test.go(1 hunks)management/server/peer.go(1 hunks)management/server/peer_test.go(4 hunks)management/server/route_test.go(1 hunks)management/server/store/sql_store.go(3 hunks)management/server/store/store.go(1 hunks)management/server/types/job.go(1 hunks)shared/management/client/client.go(1 hunks)shared/management/client/client_test.go(2 hunks)shared/management/client/grpc.go(8 hunks)shared/management/client/mock.go(2 hunks)shared/management/http/api/generate.sh(1 hunks)shared/management/http/api/openapi.yml(2 hunks)shared/management/http/api/types.gen.go(8 hunks)shared/management/proto/management.proto(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (44)
management/internals/server/controllers.go (3)
management/server/jobChannel.go (2)
JobManager(24-31)NewJobManager(33-43)management/internals/server/container.go (1)
Create(6-10)management/server/store/store.go (1)
Store(50-211)
management/server/route_test.go (3)
management/server/account.go (1)
BuildManager(186-291)management/server/updatechannel.go (1)
NewPeersUpdateManager(30-36)management/server/jobChannel.go (1)
NewJobManager(33-43)
shared/management/client/client.go (2)
management/server/types/job.go (1)
Job(28-52)shared/management/proto/management.pb.go (6)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)
client/jobexec/executor.go (3)
client/internal/debug/debug.go (3)
GeneratorDependencies(238-243)BundleConfig(231-236)NewBundleGenerator(245-265)client/internal/debug/upload.go (1)
UploadDebugBundle(17-28)upload-server/types/upload.go (1)
DefaultBundleURL(11-11)
management/server/store/store.go (1)
management/server/types/job.go (1)
Job(28-52)
client/internal/debug/upload.go (1)
upload-server/types/upload.go (3)
GetURLResponse(15-18)ClientHeader(5-5)ClientHeaderValue(7-7)
client/status/status_test.go (1)
client/status/status.go (1)
ConvertToStatusOutputOverview(108-153)
client/internal/debug/upload_test.go (2)
client/internal/debug/upload.go (1)
UploadDebugBundle(17-28)upload-server/types/upload.go (1)
GetURLPath(9-9)
management/internals/server/boot.go (2)
management/server/grpcserver.go (1)
NewServer(79-138)management/server/jobChannel.go (1)
JobManager(24-31)
client/ui/debug.go (1)
client/status/status.go (1)
ConvertToStatusOutputOverview(108-153)
management/server/peer_test.go (3)
management/server/account.go (1)
BuildManager(186-291)management/server/updatechannel.go (1)
NewPeersUpdateManager(30-36)management/server/jobChannel.go (1)
NewJobManager(33-43)
management/server/account.go (1)
management/server/jobChannel.go (1)
JobManager(24-31)
management/server/dns_test.go (3)
management/server/account.go (1)
BuildManager(186-291)management/server/updatechannel.go (1)
NewPeersUpdateManager(30-36)management/server/jobChannel.go (1)
NewJobManager(33-43)
shared/management/client/mock.go (2)
shared/management/proto/management.pb.go (6)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)management/server/types/job.go (1)
Job(28-52)
client/cmd/status.go (1)
client/status/status.go (1)
ConvertToStatusOutputOverview(108-153)
client/server/debug.go (1)
client/internal/debug/upload.go (1)
UploadDebugBundle(17-28)
management/server/mock_server/account_mock.go (1)
management/server/types/job.go (1)
Job(28-52)
client/cmd/testutil_test.go (3)
management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/account.go (1)
BuildManager(186-291)management/server/grpcserver.go (1)
NewServer(79-138)
management/internals/server/modules.go (3)
management/server/account.go (1)
BuildManager(186-291)management/server/updatechannel.go (1)
PeersUpdateManager(20-27)management/server/jobChannel.go (1)
JobManager(24-31)
client/internal/debug/debug.go (1)
util/log.go (1)
SpecialLogs(25-28)
management/server/http/handlers/peers/peers_handler.go (4)
management/server/context/auth.go (1)
GetUserAuthFromContext(47-52)shared/management/http/util/util.go (3)
WriteError(84-120)WriteErrorResponse(70-80)WriteJSONObject(27-35)shared/management/http/api/types.gen.go (3)
JobRequest(708-710)JobResponse(713-721)JobResponseStatus(724-724)management/server/types/job.go (3)
NewJob(61-97)Job(28-52)Workload(54-58)
client/cmd/up.go (1)
util/log.go (1)
FindFirstLogPath(77-84)
client/server/server.go (1)
client/status/status.go (1)
ToProtoFullStatus(472-557)
management/server/management_proto_test.go (3)
management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/account.go (1)
BuildManager(186-291)management/server/grpcserver.go (1)
NewServer(79-138)
management/server/http/testing/testing_tools/channel/channel.go (2)
management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/account.go (1)
BuildManager(186-291)
management/server/management_test.go (1)
management/server/jobChannel.go (1)
NewJobManager(33-43)
management/server/nameserver_test.go (5)
management/server/account.go (1)
BuildManager(186-291)management/server/updatechannel.go (1)
NewPeersUpdateManager(30-36)management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/integrated_validator.go (6)
MockIntegratedValidator(125-128)MockIntegratedValidator(153-155)MockIntegratedValidator(157-159)MockIntegratedValidator(161-163)MockIntegratedValidator(165-167)MockIntegratedValidator(169-171)management/server/integrations/port_forwarding/controller.go (1)
NewControllerMock(20-22)
client/cmd/debug.go (1)
client/status/status.go (1)
ConvertToStatusOutputOverview(108-153)
management/server/account/manager.go (1)
management/server/types/job.go (1)
Job(28-52)
management/server/grpcserver.go (4)
management/server/jobChannel.go (2)
JobManager(24-31)JobEvent(18-22)shared/management/proto/management_grpc.pb.go (1)
ManagementService_JobServer(427-431)shared/management/proto/management.pb.go (9)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)EncryptedMessage(322-333)EncryptedMessage(348-348)EncryptedMessage(363-365)encryption/message.go (1)
EncryptMessage(10-24)
shared/management/client/client_test.go (3)
management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/account.go (1)
BuildManager(186-291)management/server/grpcserver.go (1)
NewServer(79-138)
management/server/jobChannel.go (3)
shared/management/proto/management.pb.go (6)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)management/server/store/store.go (1)
Store(50-211)management/server/types/job.go (1)
Job(28-52)
client/status/status.go (2)
client/proto/daemon.pb.go (21)
FullStatus(1768-1781)FullStatus(1794-1794)FullStatus(1809-1811)ManagementState(1578-1585)ManagementState(1598-1598)ManagementState(1613-1615)SignalState(1517-1524)SignalState(1537-1537)SignalState(1552-1554)LocalPeerState(1424-1435)LocalPeerState(1448-1448)LocalPeerState(1463-1465)PeerState(1251-1272)PeerState(1285-1285)PeerState(1300-1302)RelayState(1639-1646)RelayState(1659-1659)RelayState(1674-1676)NSGroupState(1699-1707)NSGroupState(1720-1720)NSGroupState(1735-1737)client/internal/peer/status.go (6)
FullStatus(150-160)ManagementState(127-131)SignalState(120-124)LocalPeerState(105-111)RosenpassState(134-137)NSGroupState(141-147)
management/server/store/sql_store.go (1)
management/server/types/job.go (3)
Job(28-52)JobStatusPending(17-17)JobStatusFailed(19-19)
client/internal/engine.go (7)
client/internal/profilemanager/config.go (1)
Config(83-148)client/jobexec/executor.go (2)
Executor(13-14)NewExecutor(16-18)shared/management/client/client.go (1)
Client(14-27)client/internal/state.go (1)
CtxGetState(31-33)client/status/status.go (3)
ToProtoFullStatus(472-557)ConvertToStatusOutputOverview(108-153)ParseToFullDetailSummary(455-470)version/version.go (1)
NetbirdVersion(18-20)client/internal/debug/debug.go (2)
GeneratorDependencies(238-243)BundleConfig(231-236)
management/server/types/job.go (3)
shared/management/http/api/types.gen.go (8)
JobRequest(708-710)WorkloadResponse(1948-1950)BundleParameters(361-373)BundleResult(376-378)BundleWorkloadResponse(391-399)WorkloadTypeBundle(195-195)WorkloadRequest(1943-1945)JobResponse(713-721)shared/management/status/error.go (4)
Errorf(70-75)BadRequest(36-36)InvalidArgument(27-27)Error(54-57)shared/management/proto/management.pb.go (2)
JobStatus_succeeded(29-29)JobStatus_failed(30-30)
client/internal/connect.go (2)
client/internal/engine.go (2)
NewEngine(237-278)EngineConfig(87-141)client/internal/profilemanager/config.go (1)
Config(83-148)
client/internal/engine_test.go (4)
client/internal/engine.go (2)
NewEngine(237-278)EngineConfig(87-141)management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/account.go (1)
BuildManager(186-291)management/server/grpcserver.go (1)
NewServer(79-138)
management/server/peer.go (5)
management/server/account.go (1)
DefaultAccountManager(68-121)management/server/types/job.go (2)
Job(28-52)Workload(54-58)shared/management/status/error.go (7)
NewPermissionValidationError(213-215)NewPermissionDeniedError(209-211)NewPeerNotPartOfAccountError(105-107)Errorf(70-75)BadRequest(36-36)Internal(24-24)Type(46-46)management/server/store/store.go (3)
Store(50-211)LockingStrengthNone(47-47)LockingStrengthUpdate(43-43)management/server/activity/codes.go (1)
JobCreatedByUser(183-183)
shared/management/http/api/types.gen.go (2)
shared/management/proto/management.pb.go (9)
BundleParameters(554-563)BundleParameters(578-578)BundleParameters(593-595)BundleResult(625-631)BundleResult(646-646)BundleResult(661-663)JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)management/server/types/job.go (1)
Workload(54-58)
management/server/account_test.go (2)
management/server/account.go (1)
BuildManager(186-291)management/server/jobChannel.go (1)
NewJobManager(33-43)
shared/management/client/grpc.go (4)
shared/management/proto/management.pb.go (13)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)EncryptedMessage(322-333)EncryptedMessage(348-348)EncryptedMessage(363-365)JobStatus_failed(30-30)SyncResponse(721-738)SyncResponse(753-753)SyncResponse(768-770)shared/management/proto/management_grpc.pb.go (2)
ManagementService_JobClient(169-173)ManagementService_SyncClient(89-92)encryption/message.go (2)
EncryptMessage(10-24)DecryptMessage(27-40)client/system/info.go (1)
Info(45-75)
client/embed/embed.go (1)
client/internal/connect.go (1)
NewConnectClient(51-62)
client/server/server_test.go (3)
management/server/jobChannel.go (1)
NewJobManager(33-43)management/server/account.go (1)
BuildManager(186-291)management/server/grpcserver.go (1)
NewServer(79-138)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Linux
- GitHub Check: Darwin
- GitHub Check: Windows
- GitHub Check: release_ui
- GitHub Check: release
- GitHub Check: release_ui_darwin
- GitHub Check: iOS / Build
- GitHub Check: Android / Build
- GitHub Check: Client / Unit
- GitHub Check: JS / Lint
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: Build Cache
🔇 Additional comments (27)
management/server/activity/codes.go (1)
180-183: LGTM! New activity codes follow the established pattern.The new
JobCreatedByUseractivity constant and its corresponding map entry are correctly implemented, following the existing sequential numbering scheme and formatting conventions.Also applies to: 291-294
client/ios/NetBirdSDK/client.go (1)
24-24: LGTM! Import reordering is cosmetic.This is a formatting-only change with no functional impact.
client/cmd/up.go (1)
203-203: LGTM! Log path is now correctly propagated.The second argument now passes the log path using
util.FindFirstLogPath(logFiles), aligning with the PR's objective to propagate log path information through the client connection.client/cmd/testutil_test.go (1)
88-88: LGTM! JobManager is correctly wired into test setup.The JobManager creation and integration into
BuildManagerandNewServerfollows the correct pattern, as confirmed by the relevant code snippets. Passingnilfor metrics in test setup is acceptable.Also applies to: 114-114, 120-120
client/internal/debug/upload_test.go (1)
1-1: LGTM! Test correctly updated for the refactored public API.The package rename and function call update to the exported
UploadDebugBundlecorrectly reflect the refactoring that moved upload functionality to the debug package.Also applies to: 41-41
go.mod (1)
68-68: Dependencies verified: v1.1.2 and v2.0.0 are current with no known security advisories.The versions for
oapi-codegen/runtimeandgo-jsonmerge/v2are current and have no known vulnerabilities.shared/management/http/api/openapi.yml (2)
35-38: New Jobs tag: looks good.Tag added and marked experimental; aligns with feature scope.
41-49: Schemas and discriminator mapping: solid.WorkloadType + oneOf mapping for request/response is correct and future‑proof.
If you want, I can lint this section with an OpenAPI 3.1 validator.
Also applies to: 86-107, 109-121, 123-156
management/server/route_test.go (1)
1288-1289: JobManager wiring verified across all callsites.All 16 BuildManager invocations in the codebase have been properly updated to include the JobManager parameter. Test code consistently passes
NewJobManager(nil, store)orNewJobManager(nil, s), and production code usess.JobManager(). No stale signatures remain.client/internal/debug/debug.go (1)
221-221: Field rename verified complete—no stale usages remain.The refactoring of
logFile→logPathin theBundleGeneratorstruct is complete:
- New field declared at line 221 ✓
- Initialized at line 258 ✓
- All usages within
BundleGeneratormethods correctly referenceg.logPath✓- No lingering references to the old field name found
The separate
Server.logFilefield (line 59 inserver.go) remains unchanged, which is correct—that's a different struct in a different package. The call atserver/debug.go:32correctly passess.logFileto theLogPathfield ofGeneratorDependencies.management/server/account_test.go (1)
2951-2954: JobManager wiring fits the updated constructorPassing
NewJobManager(nil, store)keeps the test helper aligned with the newBuildManagersignature—looks good.shared/management/client/mock.go (1)
23-49: Mock client covers the new Job streaming hookThe delegate pattern stays consistent with the other mock methods—thanks for extending the mock surface.
client/server/server_test.go (1)
295-323: Test setup stays in lockstep with production wiringInjecting
NewJobManager(nil, store)mirrors the runtime constructors, so the test harness remains representative. Looks good.client/cmd/status.go (1)
102-102: LGTM! Signature change aligns with refactored status conversion API.The call correctly passes
resp.GetFullStatus()andresp.GetDaemonVersion()to match the updated signature.management/server/http/testing/testing_tools/channel/channel.go (2)
47-47: LGTM! JobManager test wiring.Creating JobManager with nil metrics is appropriate for test context.
67-67: LGTM! BuildManager updated with JobManager parameter.The jobManager is correctly wired as the 4th argument, consistent with the updated BuildManager signature.
management/internals/server/boot.go (1)
148-148: LGTM! NewServer signature updated with JobManager.The JobManager is correctly positioned as the 6th parameter, aligning with the updated NewServer signature shown in relevant snippets.
shared/management/client/client_test.go (3)
72-72: LGTM! JobManager test initialization.Consistent with test wiring pattern across the codebase.
115-115: LGTM! BuildManager wired with JobManager.The jobManager is correctly passed as the 4th argument.
123-123: LGTM! NewServer wired with JobManager.The jobManager parameter is correctly positioned in the NewServer call.
management/server/store/store.go (1)
206-210: LGTM! Store interface extended with peer job methods.The five new methods provide a complete job lifecycle API: creation, completion, retrieval by ID, listing by peer, and cleanup for pending jobs. Signatures are consistent with store conventions.
management/server/management_proto_test.go (3)
431-431: LGTM! JobManager test initialization.Consistent with the test wiring pattern established across the PR.
455-455: LGTM! BuildManager wired with JobManager.The jobManager is correctly passed as the 4th argument, matching the updated signature.
466-466: LGTM! NewServer wired with JobManager.The jobManager parameter is correctly positioned in the NewServer call.
management/server/peer_test.go (1)
1292-1292: LGTM! BuildManager calls updated with JobManager.All test functions consistently wire JobManager into BuildManager using the inline
NewJobManager(nil, store)pattern. The nil metrics parameter is appropriate for test context.Also applies to: 1372-1372, 1520-1520, 1595-1595
management/server/account/manager.go (1)
129-131: LGTM! Manager interface extended with peer job methods.The three new methods provide peer job management at the account level. Including
userIDin the signatures enables proper authorization checks, which is essential for multi-user account scenarios.shared/management/http/api/types.gen.go (1)
2120-2137: Union setter keeps the discriminator consistent.Like that the helper forces
typeto"bundle"before marshalling, so consumers can’t accidentally craft an invalid variant.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
management/server/mock_server/account_mock.go (2)
144-149: Fix nil-pointer guard to check the function field, not the method.Line 145 checks
am.GetAllPeerJobs(the method itself, which is always non-nil) instead ofam.GetAllPeerJobsFunc. This means the guard always passes, and if a test leavesGetAllPeerJobsFuncunset, line 146 will panic. Additionally, the error message incorrectly references "GetAllJobs".Apply this diff:
func (am *MockAccountManager) GetAllPeerJobs(ctx context.Context, accountID, userID, peerID string) ([]*types.Job, error) { - if am.GetAllPeerJobs != nil { + if am.GetAllPeerJobsFunc != nil { return am.GetAllPeerJobsFunc(ctx, accountID, userID, peerID) } - return nil, status.Errorf(codes.Unimplemented, "method GetAllJobs is not implemented") + return nil, status.Errorf(codes.Unimplemented, "method GetAllPeerJobs is not implemented") }
150-155: Fix nil-pointer guard to check the function field, not the method.Line 151 checks
am.GetPeerJobByID(the method itself, which is always non-nil) instead ofam.GetPeerJobByIDFunc. This means the guard always passes, and if a test leavesGetPeerJobByIDFuncunset, line 152 will panic. Additionally, the error message incorrectly references "CreateJob".Apply this diff:
func (am *MockAccountManager) GetPeerJobByID(ctx context.Context, accountID, userID, peerID, jobID string) (*types.Job, error) { - if am.GetPeerJobByID != nil { + if am.GetPeerJobByIDFunc != nil { return am.GetPeerJobByIDFunc(ctx, accountID, userID, peerID, jobID) } - return nil, status.Errorf(codes.Unimplemented, "method CreateJob is not implemented") + return nil, status.Errorf(codes.Unimplemented, "method GetPeerJobByID is not implemented") }
🧹 Nitpick comments (1)
management/server/store/sql_store.go (1)
136-144: LGTM with minor comment inconsistency.The implementation correctly creates a job record with proper error handling. The comment on line 136 says "SaveJob" but the method is named "CreatePeerJob" - this is a minor inconsistency but doesn't affect functionality.
Consider updating the comment to match the method name:
-// SaveJob persists a job in DB +// CreatePeerJob persists a job in DB
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
management/server/mock_server/account_mock.go(1 hunks)management/server/store/sql_store.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
management/server/store/sql_store.go (1)
management/server/types/job.go (3)
Job(28-52)JobStatusPending(17-17)JobStatusFailed(19-19)
management/server/mock_server/account_mock.go (1)
management/server/types/job.go (1)
Job(28-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: Management / Benchmark (API) (amd64, sqlite)
- GitHub Check: Management / Integration (amd64, postgres)
- GitHub Check: Management / Benchmark (API) (amd64, postgres)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Management / Benchmark (amd64, sqlite)
- GitHub Check: Client / Unit (386)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Management / Benchmark (amd64, postgres)
- GitHub Check: Darwin
- GitHub Check: Linux
- GitHub Check: Windows
- GitHub Check: Client / Unit
- GitHub Check: JS / Lint
- GitHub Check: release_ui_darwin
- GitHub Check: release
- GitHub Check: Client / Unit
- GitHub Check: Android / Build
- GitHub Check: iOS / Build
- GitHub Check: Client / Unit
🔇 Additional comments (4)
management/server/store/sql_store.go (4)
49-49: LGTM!The query condition constant is correctly defined and follows the established pattern in this file.
117-117: LGTM!The Job type is correctly added to the AutoMigrate list to ensure the jobs table is created during migration.
164-164: Past issue resolved - WHERE clause is now correct.The past review comment flagged a missing space before "AND", but this has been fixed. The concatenation
accountAndPeerIDQueryCondition+" AND status = ?"correctly produces"account_id = ? and peer_id = ? AND status = ?"with proper spacing.
177-207: LGTM!Both
GetPeerJobByIDandGetPeerJobsmethods are correctly implemented:
- Proper use of query conditions and parameterized queries
- Appropriate error handling with specific error types for not found cases
- GetPeerJobs correctly orders results by
created_at DESCfor chronological listing- Consistent logging and error patterns with the rest of the store implementation
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
management/internals/shared/grpc/server.go (2)
184-219: Missing peer lock synchronization - potential race condition.Unlike
Sync()(line 282),Job()does not acquire a peer lock before processing. This could lead to race conditions if multiple concurrent Job streams are established for the same peer.Consider adding peer lock acquisition similar to the Sync flow, immediately after retrieving the peer (line 205) and before starting the response receiver and creating job channels. Store the returned unlock function and defer its call, then unlock before entering the main loop.
Based on past review comments.
206-208: Inconsistent error codes for peer registration checks.Lines 206-208 return
codes.Unauthenticatedwhen the peer is not found, while lines 198-199 returncodes.PermissionDeniedfor a similar peer-not-registered scenario. TheSync()method consistently usescodes.PermissionDenied(line 270) for both cases.Apply this diff for consistency:
peer, err := s.accountManager.GetStore().GetPeerByPeerPubKey(ctx, store.LockingStrengthNone, peerKey.String()) if err != nil { - return status.Errorf(codes.Unauthenticated, "peer is not registered") + return status.Errorf(codes.PermissionDenied, "peer is not registered") }Based on past review comments.
🧹 Nitpick comments (1)
management/internals/shared/grpc/server.go (1)
374-401: Consider immediate exit on send failure.When
sendJobfails (line 390), the channel is closed (line 391) and the error is logged (line 392), but the loop continues. While the closed channel will cause the next iteration to exit, returning immediately after closing the channel would make the flow more explicit and efficient.Apply this diff for clarity:
if err := s.sendJob(ctx, accountID, peerKey, peer, job, srv); err != nil { s.jobManager.CloseChannel(ctx, accountID, peer.ID) log.WithContext(ctx).Warnf("send job failed: %v", err) + return err }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
management/internals/shared/grpc/server.go(11 hunks)management/server/mock_server/account_mock.go(2 hunks)management/server/peer.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- management/server/peer.go
🧰 Additional context used
🧬 Code graph analysis (2)
management/server/mock_server/account_mock.go (1)
management/server/types/job.go (1)
Job(34-58)
management/internals/shared/grpc/server.go (5)
management/server/job/jobChannel.go (2)
Manager(25-32)Event(19-23)management/server/account/manager.go (1)
Manager(27-131)shared/management/proto/management_grpc.pb.go (1)
ManagementService_JobServer(427-431)shared/management/status/error.go (7)
FromError(78-87)NotFound(21-21)Errorf(70-75)PermissionDenied(18-18)Unauthenticated(39-39)InvalidArgument(27-27)Internal(24-24)encryption/message.go (1)
EncryptMessage(10-24)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
- GitHub Check: Client / Unit (386)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Management / Benchmark (amd64, postgres)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Management / Benchmark (amd64, sqlite)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Management / Integration (amd64, sqlite)
- GitHub Check: Management / Integration (amd64, postgres)
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Management / Benchmark (API) (amd64, postgres)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Management / Benchmark (API) (amd64, sqlite)
- GitHub Check: Windows
- GitHub Check: Darwin
- GitHub Check: Linux
- GitHub Check: Client / Unit
- GitHub Check: iOS / Build
- GitHub Check: Android / Build
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: release_ui_darwin
- GitHub Check: release
- GitHub Check: JS / Lint
🔇 Additional comments (2)
management/server/mock_server/account_mock.go (2)
133-136: LGTM! Field declarations are well-structured.The new mock function fields follow the established pattern in this file and have appropriate signatures for peer job operations.
138-156: LGTM! All previously identified issues have been resolved.The three peer job methods now correctly:
- Check their own function pointers (not CreatePeerJobFunc for all three)
- Use accurate error messages matching the method names
- Delegate with proper parameter ordering
The implementation follows the established mock pattern consistently.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (4)
shared/management/proto/management.proto (1)
66-89: Address the field naming and type consistency issues.This duplicates a previous review comment. The field naming remains inconsistent:
JobRequest.IDandJobResponse.ID,JobResponse.Reasonuse PascalCaseBundleParametersfields use snake_casebytestype forIDandReasonmakes debugging harder (IDs are typically UUIDs as strings, reasons are error messages)Protobuf style guides recommend snake_case for all fields and
stringfor human-readable identifiers and messages.Based on past review comments.
client/internal/engine.go (1)
979-1012: Add defensive nil checks before dereferencing ProfileConfig and ManagementURL.This addresses the previous review concern.
handleBundledereferencese.config.ProfileConfig(line 987, 1001) ande.config.ProfileConfig.ManagementURL(line 1001) without nil checks, which can cause panics if the engine is created with incomplete configuration.Apply this diff to add defensive checks:
func (e *Engine) handleBundle(params *mgmProto.BundleParameters) (*mgmProto.JobResponse_Bundle, error) { log.Infof("handle remote debug bundle request: %s", params.String()) + if e.config.ProfileConfig == nil { + return nil, errors.New("profile config is not set") + } + if e.config.ProfileConfig.ManagementURL == nil { + return nil, errors.New("management URL is not configured") + } syncResponse, err := e.GetLatestSyncResponse() if err != nil { log.Warnf("get latest sync response: %v", err) } bundleDeps := debug.GeneratorDependencies{ InternalConfig: e.config.ProfileConfig, StatusRecorder: e.statusRecorder, SyncResponse: syncResponse, LogPath: e.config.LogPath, }Based on past review comments.
management/internals/shared/grpc/server.go (2)
184-219: Add peer lock acquisition for race-free job handling.This duplicates a previous review comment. Unlike
Sync()(line 282), theJob()method does not acquire a per-peer lock before processing. This can lead to race conditions if multiple concurrent Job streams are established for the same peer.Consider acquiring the peer lock after resolving the peer (around line 209) and releasing it before entering
sendJobsLoop, similar to the Sync flow:func (s *Server) Job(srv proto.ManagementService_JobServer) error { ... // nolint:staticcheck ctx = context.WithValue(ctx, nbContext.AccountIDKey, accountID) + // nolint:staticcheck + ctx = context.WithValue(ctx, nbContext.PeerIDKey, peerKey.String()) + + unlock := s.acquirePeerLockByUID(ctx, peerKey.String()) + defer func() { + if unlock != nil { + unlock() + } + }() + peer, err := s.accountManager.GetStore().GetPeerByPeerPubKey(ctx, store.LockingStrengthNone, peerKey.String()) if err != nil { return status.Errorf(codes.PermissionDenied, "peer is not registered") } // Start background response handler s.startResponseReceiver(ctx, srv) // Prepare per-peer state updates := s.jobManager.CreateJobChannel(ctx, accountID, peer.ID) log.WithContext(ctx).Debugf("Job: took %v", time.Since(reqStart)) + + unlock() + unlock = nil // Main loop: forward jobs to client return s.sendJobsLoop(ctx, accountID, peerKey, peer, updates, srv) }Based on past review comments.
198-201: Use consistent error codes for peer registration checks.This duplicates a previous review comment. Line 199 returns
codes.PermissionDeniedfor a not-found peer, while line 207 returnscodes.Unauthenticatedfor the same scenario. TheSync()method consistently usescodes.PermissionDenied(line 270) for both cases.Apply this diff for consistency:
peer, err := s.accountManager.GetStore().GetPeerByPeerPubKey(ctx, store.LockingStrengthNone, peerKey.String()) if err != nil { - return status.Errorf(codes.Unauthenticated, "peer is not registered") + return status.Errorf(codes.PermissionDenied, "peer is not registered") }Based on past review comments.
Also applies to: 206-208
🧹 Nitpick comments (3)
management/server/http/testing/testing_tools/channel/channel.go (1)
54-54: Consider passingmetricsinstead ofnilfor consistency.The
metricsvariable is already initialized on line 48 and passed toBuildManageron line 78. For consistency and to avoid potential nil-pointer issues if the JobManager attempts to emit metrics, consider passingmetricshere as well.Apply this diff:
- jobManager := job.NewJobManager(nil, store) + jobManager := job.NewJobManager(metrics, store)client/server/server_test.go (1)
298-298: Consider passing metrics to JobManager for consistency.The
jobManageris created withnilmetrics, but ametricsinstance is created just below (lines 313-314). Consider reordering the initialization to pass the actual metrics instance toNewJobManager, making the test setup more consistent with production usage.Apply this diff to improve consistency:
t.Cleanup(cleanUp) - jobManager := job.NewJobManager(nil, store) eventStore := &activity.InMemoryEventStore{} if err != nil { return nil, "", err } ctrl := gomock.NewController(t) t.Cleanup(ctrl.Finish) permissionsManagerMock := permissions.NewMockManager(ctrl) peersManager := peers.NewManager(store, permissionsManagerMock) settingsManagerMock := settings.NewMockManager(ctrl) ia, _ := integrations.NewIntegratedValidator(context.Background(), peersManager, settingsManagerMock, eventStore) metrics, err := telemetry.NewDefaultAppMetrics(context.Background()) require.NoError(t, err) + + jobManager := job.NewJobManager(metrics, store)management/server/account_test.go (1)
2964-2964: Consider reusing the available metrics instance.The change correctly wires the JobManager dependency into the test setup. However, since
metricsis already created (line 2939) and passed to BuildManager, you could pass it to NewJobManager as well for consistency:-manager, err := BuildManager(ctx, nil, store, networkMapController, job.NewJobManager(nil, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false) +manager, err := BuildManager(ctx, nil, store, networkMapController, job.NewJobManager(metrics, store), nil, "", eventStore, nil, false, MockIntegratedValidator{}, metrics, port_forwarding.NewControllerMock(), settingsMockManager, permissionsManager, false)This would provide consistent telemetry across all components. Note that this pattern of passing nil for metrics appears in other test files as well.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
shared/management/proto/management.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (16)
client/cmd/testutil_test.go(3 hunks)client/internal/engine.go(13 hunks)client/internal/engine_test.go(9 hunks)client/server/server_test.go(3 hunks)management/internals/server/controllers.go(3 hunks)management/internals/shared/grpc/server.go(11 hunks)management/server/account_test.go(2 hunks)management/server/dns_test.go(2 hunks)management/server/http/testing/testing_tools/channel/channel.go(3 hunks)management/server/management_proto_test.go(4 hunks)management/server/management_test.go(4 hunks)management/server/nameserver_test.go(2 hunks)management/server/peer_test.go(5 hunks)management/server/route_test.go(2 hunks)shared/management/client/client_test.go(3 hunks)shared/management/proto/management.proto(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- management/server/nameserver_test.go
- shared/management/client/client_test.go
- management/internals/server/controllers.go
- management/server/management_proto_test.go
- management/server/dns_test.go
- management/server/route_test.go
🧰 Additional context used
🧬 Code graph analysis (9)
management/internals/shared/grpc/server.go (7)
management/server/job/jobChannel.go (2)
Manager(25-32)Event(19-23)management/server/account/manager.go (1)
Manager(27-131)shared/management/proto/management_grpc.pb.go (1)
ManagementService_JobServer(427-431)shared/management/status/error.go (7)
FromError(78-87)NotFound(21-21)Errorf(70-75)PermissionDenied(18-18)Unauthenticated(39-39)InvalidArgument(27-27)Internal(24-24)management/server/store/store.go (1)
LockingStrengthNone(47-47)shared/management/proto/management.pb.go (9)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)EncryptedMessage(322-333)EncryptedMessage(348-348)EncryptedMessage(363-365)encryption/message.go (1)
EncryptMessage(10-24)
management/server/http/testing/testing_tools/channel/channel.go (2)
management/server/job/jobChannel.go (1)
NewJobManager(34-44)management/server/account.go (1)
BuildManager(180-268)
management/server/peer_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/jobChannel.go (1)
NewJobManager(34-44)
client/internal/engine.go (5)
client/internal/profilemanager/config.go (1)
Config(89-160)client/jobexec/executor.go (3)
Executor(23-24)NewExecutor(26-28)ErrJobNotImplemented(20-20)shared/management/client/client.go (1)
Client(14-27)shared/management/proto/management.pb.go (21)
JobRequest(388-398)JobRequest(413-413)JobRequest(428-430)JobResponse(463-475)JobResponse(490-490)JobResponse(505-507)JobStatus_failed(30-30)JobRequest_Bundle(457-459)JobRequest_Bundle(461-461)JobStatus_succeeded(29-29)BundleParameters(554-563)BundleParameters(578-578)BundleParameters(593-595)JobResponse_Bundle(548-550)JobResponse_Bundle(552-552)SyncResponse(721-738)SyncResponse(753-753)SyncResponse(768-770)BundleResult(625-631)BundleResult(646-646)BundleResult(661-663)client/internal/debug/debug.go (2)
GeneratorDependencies(238-243)BundleConfig(232-236)
management/server/account_test.go (2)
management/server/account.go (1)
BuildManager(180-268)management/server/job/jobChannel.go (1)
NewJobManager(34-44)
client/cmd/testutil_test.go (3)
management/server/job/jobChannel.go (1)
NewJobManager(34-44)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
client/server/server_test.go (3)
management/server/job/jobChannel.go (1)
NewJobManager(34-44)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
client/internal/engine_test.go (5)
client/internal/engine.go (2)
NewEngine(235-276)EngineConfig(83-142)shared/management/client/mock.go (1)
MockClient(13-24)management/server/job/jobChannel.go (1)
NewJobManager(34-44)management/server/account.go (1)
BuildManager(180-268)management/internals/shared/grpc/server.go (1)
NewServer(84-147)
management/server/management_test.go (1)
management/server/job/jobChannel.go (1)
NewJobManager(34-44)
🔇 Additional comments (32)
management/server/http/testing/testing_tools/channel/channel.go (2)
19-19: LGTM!The import addition is necessary for the JobManager creation below.
78-78: LGTM!The
BuildManagercall now has the correct argument order withjobManagerproperly positioned as the 5th parameter.client/server/server_test.go (3)
21-21: LGTM!The import addition is necessary and correctly supports the JobManager integration below.
322-322: LGTM!The
BuildManagercall correctly includes thejobManagerparameter in the proper position, and all parameters align with the expected signature.
328-328: LGTM!The
nbgrpc.NewServercall correctly includes thejobManagerparameter in the proper position, and all parameters align with the expected signature.management/server/management_test.go (4)
31-31: LGTM!The import is correctly added to support the JobManager initialization.
212-228: LGTM!The jobManager is correctly wired into the BuildManager call, aligning the test setup with the new JobManager dependency.
235-247: LGTM!The jobManager is correctly wired into the NewServer initialization, ensuring the test server has access to the job management functionality.
183-183: No issue found—passing nil metrics is safe.The
metricsfield is defined in the Manager struct but is never accessed in any of its methods (CreateJobChannel, SendJob, HandleResponse, CloseChannel, cleanup, IsPeerConnected, IsPeerHasPendingJobs). Therefore, passingnilfor the metrics parameter will not cause any nil pointer dereferences or runtime errors. The test is correct as written.shared/management/proto/management.proto (1)
51-53: LGTM! The bidirectional streaming Job RPC is appropriate for job dispatch.The use of
stream EncryptedMessagefor both directions allows the server to push job requests and the client to stream back responses.client/internal/engine.go (7)
138-142: LGTM! EngineConfig extended appropriately for debug bundle support.The
ProfileConfigandLogPathfields are well-documented and align with the debug bundle generation feature.
205-222: LGTM! Proper synchronization added for sync response persistence and job execution.The dedicated
syncRespMuxseparates sync response locking from the mainsyncMsgMux, and thejobExecutorWGensures clean shutdown of job goroutines.
234-234: LGTM! NewEngine signature extended to support profile configuration.The additional
c *profilemanager.Configparameter enables debug bundle generation and is properly wired into the Engine.
335-336: LGTM! Proper job goroutine cleanup on shutdown.Waiting for
jobExecutorWGbefore closing resources ensures clean shutdown of the job execution path.
800-814: LGTM! Proper read/write lock usage for sync response persistence.The dedicated
syncRespMuxcorrectly guards both reads and writes topersistSyncResponseandlatestSyncResponse, preventing races.
942-977: LGTM! Job event handling properly addresses unknown workload types.The default case now correctly returns a
JobStatus_failedresponse withErrJobNotImplementedas the reason, addressing the previous review concern about returningnilfor unsupported job types.Based on past review comments.
1873-1911: LGTM! Sync response persistence properly synchronized.
SetSyncResponsePersistenceandGetLatestSyncResponsecorrectly usesyncRespMuxto guard reads/writes, and cloning the response (line 1905) prevents external mutation.management/internals/shared/grpc/server.go (5)
63-63: LGTM! JobManager properly integrated into Server.The jobManager dependency is cleanly injected through
NewServerand stored on the Server struct.Also applies to: 89-89, 130-130
333-346: LGTM! Handshake helper cleanly extracts peer key.The
handleHandshakemethod properly validates the initial hello message and extracts the peer key for authentication.
348-372: LGTM! Response receiver properly handles bidirectional job stream.The background goroutine correctly receives and forwards
JobResponsemessages to theJobManager, with appropriate error handling.
374-401: LGTM! Job forwarding loop properly handles channel closure and context cancellation.The
sendJobsLoopcorrectly forwards job events to the client and cleans up the job channel on errors or disconnection.
455-474: LGTM! Job encryption and sending properly handle errors.The
sendJobmethod now correctly propagates encryption errors (line 462) instead of masking them, and the log message typo has been fixed (line 472). Both issues from the previous review are addressed.Based on past review comments.
client/internal/engine_test.go (3)
252-252: LGTM! Test NewEngine calls properly updated for extended signature.All test cases consistently pass
nil, nilfor the newchecksandprofileConfigparameters, which is appropriate for tests that don't exercise the debug bundle functionality.Also applies to: 420-420, 639-639, 804-804, 1006-1006, 1532-1532
1591-1591: LGTM! JobManager properly initialized for test infrastructure.Creating the
JobManagerwithnilmetrics and passing the teststoreis appropriate for test setup.
1622-1622: LGTM! Test harness properly wired with JobManager dependency.Both
BuildManagerandNewServernow receive thejobManagerparameter, matching the updated production signatures.Also applies to: 1628-1628
management/server/account_test.go (1)
35-35: LGTM: Import correctly added for JobManager dependency.The job package import is necessary for the NewJobManager usage in the test setup.
client/cmd/testutil_test.go (4)
19-19: LGTM: Import correctly added.
92-92: LGTM: JobManager correctly initialized.The JobManager is properly created and will be wired into both BuildManager and NewServer calls.
123-123: LGTM: BuildManager correctly updated with JobManager parameter.
129-129: LGTM: NewServer correctly updated with JobManager parameter.management/server/peer_test.go (2)
35-35: LGTM: Import correctly added.
1296-1296: LGTM: All BuildManager calls correctly updated with JobManager parameter.The changes consistently wire the JobManager dependency into the test setup across multiple test functions. The pattern of passing
job.NewJobManager(nil, store)is consistent with the approach in other test files.Note: Similar to
account_test.go, metrics are created in these test functions but nil is passed to NewJobManager. While the current implementation works fine for tests, passing the available metrics instance would provide more consistent telemetry (this is an optional improvement, not a requirement).Also applies to: 1381-1381, 1534-1534, 1614-1614
Introduce a Channel abstraction to wrap job event channels with proper error handling. This prevents panics when reading from or closing channels by using explicit error returns instead of relying on channel close signals. Key improvements: - Add Channel struct with Event(), AddEvent(), and Close() methods - Replace select-based channel reads with Channel.Event() that returns ErrJobChannelClosed - Use sync.Once in Channel.Close() to prevent double-close panics - Remove redundant CloseChannel() calls from error paths in sendJob() - Simplify sendJobsLoop() and sendJob() function signatures
|
…) (#4832) This PR adds the ability to trigger debug bundle generation remotely from the Management API/Dashboard.



Describe your changes
Issue ticket number and link #4354
Stack
feature/remote-debug-release-Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
New Features
Improvements
Chores